This study sought to evaluate the effect of speech intensity on performance of the Callsign Acquisition
Test (CAT) and Modified Rhyme Test (MRT) presented in noise. Fourteen normally hearing listeners
performed both tests in 65 dB A white background noise. Speech intensity varied while background noise
remained constant to form speech-to-noise ratios (SNRs) of -18, -15, -12, -9, and -6 dB. Results
showed that CAT recognition scores were significantly higher than MRT scores at the same SNRs; however,
the scores from both tests were highly correlated and their relationship for the SNRs tested can be
expressed by a simple linear function. The concept of CAT can be easily ported to other languages for
testing speech communication under adverse listening conditions.

1. Introduction

Speech intelligibility (SI) is defined as the percentage of speech units (i.e., phonemes,
syllables, words, phrases, or sentences) that may be correctly identified by a listener
(Letowski et al., 2001). Several different SI tests are currently utilized in both research and
practice. One of the more recent English language SI tests is the Callsign Acquisition Test (CAT)
developed by the United States Army Research Laboratory. The CAT has general applications similar
to those of the Modified Rhyme Test (MRT) (Fairbanks, 1958; House et al., 1965), which is widely
used for assessing SI in easy to moderate speech communication conditions. The primary goal of
the CAT is to predict SI of military communications in difficult listening environments
characterized by poor signal-to-noise ratios.

The current version of CAT has been used in several studies and evaluated by multiple
researchers (Blue et al., 2004; 2010; Rao, Letowski, 2006); however, since it is a relatively
new instrument, it still lacks full validation and standardization. The standardization process
of any new SI test involves, among other things, determining test validity and sensitivity;
evaluating the effects of noise, talker's voice, and listening environment; and comparing its
scores to scores obtained with existing SI tests. The technical and procedural factors that
affect the scores obtained with any SI test material include speech intensity level and
speech-to-noise ratio (SNR). The objective of the present study was to measure the effectiveness
of the CAT and compare it to the MRT across various SNRs in the presence of a 65 dB A white noise.

2.  Methodology 

2.1.  Participants 

A total of 14 normally hearing listeners participated in the study. Normal hearing was defined
as pure-tone hearing thresholds at or below 20 dB HL at audiometric octave frequencies from 250
through 8000 Hz. The group comprised 8 male and 6 female listeners between the ages of 18 and
25 years.

2.2.  Instrumentation 

The study was conducted in an Industrial Acoustics Company (IAC) 143M audiometric booth.
Instrumentation for the research included (1) a Dell IBM PC/586 computer with a CD ROM drive,
(2) two Hewlett-Packard HP-350D step attenuators, (3) a Crown D-75 power amplifier, (4) a CD ROM
with test materials and in-house CAT and MRT software for speech signal delivery and data
collection, and (5) a pair of AKG K-1000 earphones. A KEMAR (Knowles Electronic Manikin for
Acoustic Research) simulator with a Zwislocki coupler (ANSI S3.25) was used to measure sound
pressure levels generated at the ear of the listener.

The test materials were installed on the Dell computer. Both the speech signals and noise were
played through a multi-channel sound card (Turtle Beach Santa Cruz). The speech signals were
played through one channel and the noise signal was played through the second channel. The speech
and noise levels were controlled by two independent HP-350D step attenuators. Once the speech and
noise signals were adjusted to the proper sound pressure levels, they were played through the
earphones.

2.3.  Test  materials 

The  Modified  Rhyme  Test  (MRT)  (Fairbanks, 
1958;  House  et  al. ,  1965)  is  the  most  frequently  used  SI 
test  for  evaluating  transmission  capabilities  of  acoustic 
and  audio  systems.  The  test  uses  a  battery  of  50  sets 
of  6  one-syllable  rhyming  or  similar  sounding  words  to 
test  initial  and  final  consonant  recognition.  During  the 
test,  one  of  the  words  from  the  list  is  presented  to  the 
listener  verbally  and  the  listener  is  required  to  indicate 
which  one  of  the  six  words  in  the  list  was  presented. 

The Callsign Acquisition Test (CAT) was developed by the United States Army Research Laboratory
in response to criticisms that widely used SI testing materials are not effective in certain
contexts, particularly military environments, which are noisy and characterized by limited
vocabulary communications. Several authors have reported that the use of military personnel in
SI studies requires military-specific test material to generate reliable scores (Rao, Letowski,
2006; Howes, 1957). The CAT combines two-syllable words based on the phonetic alphabet with
one-syllable numeric digits to form a total of 126 three-syllable alphanumeric calling phrases
(callsigns). They constitute a family of test items that is familiar to both military personnel
and civilians, making the test useful both inside and outside of military environments.
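
As a rough illustration of how 126 items can arise from such a vocabulary, the sketch below pairs 18 two-syllable code words with 7 one-syllable digits. The specific word and digit lists are assumptions inferred from the NATO phonetic alphabet (the text above does not enumerate them), and the names `CODE_WORDS`, `DIGITS`, and `build_callsigns` are hypothetical.

```python
from itertools import product

# Assumed lists (not enumerated in the text): the two-syllable words of the
# NATO phonetic alphabet and a set of seven one-syllable digits.
CODE_WORDS = ["Alpha", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot",
              "Hotel", "Kilo", "Lima", "Oscar", "Papa", "Quebec",
              "Tango", "Victor", "Whiskey", "X-ray", "Yankee", "Zulu"]
DIGITS = ["One", "Two", "Three", "Four", "Five", "Six", "Eight"]


def build_callsigns(words=CODE_WORDS, digits=DIGITS):
    """Return every word-digit pair as a three-syllable callsign."""
    return [f"{word} {digit}" for word, digit in product(words, digits)]


callsigns = build_callsigns()
print(len(callsigns))  # 18 words x 7 digits = 126
print(callsigns[-6])   # "Zulu Two"
```

Because every item is drawn from the same closed set, a listener who knows the vocabulary chooses among 126 alternatives on each trial.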


Both the MRT and CAT recordings used in the study were made at the U.S. Army Research Laboratory
by the same male native English talker speaking with a Midwestern accent. The listeners were
familiarized with the materials of both tests prior to the study to avoid learning curve effects
and to make both tests equally familiar to the listeners, since research has shown that
familiarity with the test material results in higher and more stable SI test scores (Howes, 1957;
Morton, 1969; Schultz, 1964).

2.4. Noise and speech levels

White noise presented at a constant 65 dB A level was used in the study as background noise. The
level was selected to be close to the normal conversational level of speech so the naturally
spoken speech materials could be used to produce small SNRs. The MRT and CAT test items were
presented at five different intensities (47, 50, 53, 56, and 59 dB A), resulting in SNRs of
-18, -15, -12, -9, and -6 dB. Speech levels were determined by averaging dB A levels measured
separately for each word.
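
The arithmetic behind these conditions can be sketched as follows. The levels are taken from the text; the function names are hypothetical, and the simple arithmetic mean of per-word dB A readings is an assumption about how the averaging was done.

```python
NOISE_LEVEL_DBA = 65.0                            # constant white-noise level
SPEECH_LEVELS_DBA = [47.0, 50.0, 53.0, 56.0, 59.0]


def snr_db(speech_dba, noise_dba=NOISE_LEVEL_DBA):
    """Speech-to-noise ratio as the difference of the two levels in dB."""
    return speech_dba - noise_dba


def mean_speech_level(word_levels_dba):
    """Arithmetic mean of per-word dB A readings (assumed averaging method)."""
    return sum(word_levels_dba) / len(word_levels_dba)


snrs = [snr_db(level) for level in SPEECH_LEVELS_DBA]
print(snrs)  # [-18.0, -15.0, -12.0, -9.0, -6.0]
```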

2.5.  Procedure 

Each listener was seated inside the acoustically treated booth facing a monitor and keyboard and
wearing earphones. Prior to data collection, each participant read written instructions
describing the task, was familiarized with the test material, and was given 10 practice trials
for each speech test.

The listener's task was to listen to the series of CAT or MRT items and use the appropriate
computer screen interface (shown in Fig. 1) and keyboard to record what they heard. For example,
if the listener heard “Zulu Two” from the CAT, the correct response would be “Z2; Enter”.
Pressing the “Enter” key stored the response and started the next trial. For the MRT, if the
listener heard the word “din” from the list as it appears in Fig. 1, the correct response was
“5; Enter”. If listeners were unsure of what they heard, they were instructed to make their
best guess.

Fig. 1. Screen captures of interfaces for CAT (top) and MRT (bottom) software.

3.  Results 

Table 1 shows the means (M), standard deviations (SD), and coefficients of variation (V) for each
experimental condition. The performance-intensity (PI) functions describing the relationship
between the speech intelligibility score and SNR for both the MRT and CAT are shown in Fig. 2.
To determine statistical significance, all percentage scores were transformed into rationalized
arcsine units (rau) (Studebaker, 1985) in order to reduce the ceiling effects associated with
the SI scale. An alpha level of 0.05 was used to determine significance for all statistical
tests. A two-factor ANOVA showed that the type of test had a significant effect on the SI
performance [F(1,130) = 163.00, p < 0.001], as did the SNR [F(4,130) = 43.95, p < 0.001];
however, there was no significant interaction between the two [F(4,130) = 1.08, p = 0.367].
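
For readers unfamiliar with the rau scale, a minimal sketch of the rationalized arcsine transform described by Studebaker (1985) is given below. The list length `N_ITEMS` is a hypothetical value chosen for illustration, since the number of items per condition is not stated here, and the function name is likewise an assumption.

```python
import math


def rau(correct, total):
    """Rationalized arcsine transform (Studebaker, 1985) of correct/total."""
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0


N_ITEMS = 50  # hypothetical list length, not taken from the study
for n_correct in (0, 25, 50):
    print(n_correct, round(rau(n_correct, N_ITEMS), 2))
```

The transform is close to the percentage scale in the middle of the range (a half-correct score maps to 50 rau) but stretches the scale near 0% and 100%, which is why it is preferred for statistical tests on scores close to ceiling.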


Table 1. Mean (M), standard deviation (SD), and coefficient of variation (V) for CAT and MRT.

SNR [dB]       -18      -15      -12       -9       -6
CAT    M     65.57    77.38    86.64    97.07    98.93
       SD    17.03    18.44    14.09     3.17     2.3
       V     26.42    29.02    24.97     7.78     4.88
MRT    M     35.79    48.71    61.29    68.86    78.43
       SD    17.76    18.52    16.49    15.42    13.93
       V     48.70    36.09    26.38    23.13    20.62

NOTE: Coefficient of variation (V) has been calculated using rau scores.


Fig. 2. Performance-intensity functions for CAT and MRT (horizontal axis: speech-to-noise ratio, dB).


A correlation analysis was performed to evaluate the relationship between the CAT and MRT
performance scores in the tested range of SNRs. Pearson's correlation coefficient shows that
the two tests have a strong positive relationship [r(12) = 0.84, p < 0.001], which is consistent
with the approximately parallel shift in the PI functions shown in Fig. 2.
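
The reported significance level can be checked by hand. The worked step below assumes the standard t test for a Pearson coefficient with 12 degrees of freedom; the text does not state which test was used.

```latex
t = r\sqrt{\frac{df}{1 - r^{2}}}
  = 0.84\sqrt{\frac{12}{1 - 0.84^{2}}}
  = 0.84\sqrt{\frac{12}{0.2944}}
  \approx 5.36
```

This exceeds the two-tailed critical value for p = 0.001 at 12 degrees of freedom (about 4.32), which agrees with the reported p < 0.001.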

4.  Discussion 

Based on Fig. 2 and the correlation analysis results, both PI functions have similar shapes and
slopes in the tested range of SNRs. The SI performance for the CAT increases by about 3-5%/dB
SNR from SNRs of -18 to -12 dB before beginning to plateau. Similarly, the MRT increases by
about 2-4%/dB SNR throughout its range. Nonlinear (third-order polynomial) regression analysis
was used to determine fitted equations for both functions. Equations (1) and (2) show the fitted
equations for the CAT and MRT, respectively. The fitted lines with the original data are shown
in Fig. 3.

CAT SCORE = 86.72 - 4.49(SNR) - 0.44(SNR)^2 - 0.0064(SNR)^3,  R^2 = 0.995,  (1)

MRT SCORE = 98.91 + 4.18(SNR) + 0.17(SNR)^2 + 0.0072(SNR)^3,  R^2 = 0.998.  (2)
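
Equations (1) and (2) can be evaluated directly and set beside the Table 1 means, as in the sketch below. The function and variable names are hypothetical. Because the published coefficients are rounded, the fitted values can differ from the means by a few points at some SNRs, and the polynomials should not be extrapolated outside the -18 to -6 dB range.

```python
def cat_fit(snr):
    """Eq. (1): fitted CAT score (%) as a cubic function of SNR (dB)."""
    return 86.72 - 4.49 * snr - 0.44 * snr**2 - 0.0064 * snr**3


def mrt_fit(snr):
    """Eq. (2): fitted MRT score (%) as a cubic function of SNR (dB)."""
    return 98.91 + 4.18 * snr + 0.17 * snr**2 + 0.0072 * snr**3


# Mean scores from Table 1: SNR -> (CAT mean, MRT mean)
TABLE1_MEANS = {
    -18: (65.57, 35.79),
    -15: (77.38, 48.71),
    -12: (86.64, 61.29),
    -9: (97.07, 68.86),
    -6: (98.93, 78.43),
}

for snr, (cat_mean, mrt_mean) in TABLE1_MEANS.items():
    print(f"{snr:>4} dB  CAT fit {cat_fit(snr):6.2f} (mean {cat_mean:5.2f})  "
          f"MRT fit {mrt_fit(snr):6.2f} (mean {mrt_mean:5.2f})")
```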


Fig. 3. Performance-intensity functions (solid lines) and regression functions (dashed lines)
for CAT and MRT.


Because both functions have such similar shapes and slopes, a basic model to predict CAT SI
performance in 65 dB A of white noise from MRT performance data can be formulated for speech
presented within a range of 47 to 59 dB A. The average difference between the SI scores at the
SNRs tested was 25.7 percentage points; therefore, on the basis of the data collected in this
study, an upward shift of the MRT scores by 26 percentage points results in a good estimate of
the CAT scores. That is,

CAT SCORE = 26 + MRT SCORE.  (3)
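
A minimal sketch of Eq. (3) applied to the Table 1 means follows. The names are hypothetical and the cap at 100% is an added assumption, not part of Eq. (3). Computed from the rounded Table 1 means, the average CAT-MRT difference comes out near 26.5 points rather than the 25.7 quoted above; the reason for the gap is not stated in the text, and either value rounds to the 26-point shift.

```python
SNRS = [-18, -15, -12, -9, -6]
CAT_MEANS = [65.57, 77.38, 86.64, 97.07, 98.93]  # Table 1
MRT_MEANS = [35.79, 48.71, 61.29, 68.86, 78.43]  # Table 1


def predict_cat(mrt_score, shift=26.0):
    """Eq. (3) with an assumed cap at 100% (the cap is not in the paper)."""
    return min(100.0, mrt_score + shift)


diffs = [cat - mrt for cat, mrt in zip(CAT_MEANS, MRT_MEANS)]
mean_diff = sum(diffs) / len(diffs)
print(f"mean CAT-MRT difference: {mean_diff:.2f} points")

for snr, cat, mrt in zip(SNRS, CAT_MEANS, MRT_MEANS):
    error = predict_cat(mrt) - cat
    print(f"{snr:>4} dB  predicted {predict_cat(mrt):6.2f}  "
          f"observed {cat:6.2f}  error {error:+.2f}")
```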

It should also be noted that the coefficients of variation for the CAT are lower than those for
the MRT at every SNR tested (see Table 1), indicating greater repeatability of the CAT data.
However, it is important to stress that the theoretical shapes of the MRT and CAT PI functions
are not parallel; such approximately parallel behavior has been assumed only for practical
purposes and for the limited range of SNRs investigated in this study. The MRT is a
6-alternative test with a correct guess ratio of 1/6 (16.7%), while the CAT is a
126-alternative test with a guess ratio of 1/126 (0.8%). This difference in correct guess ratios
causes the shapes of their respective PI functions to be very different at low SNR levels, and
the two functions can be approximated as parallel only in a relatively narrow range of
mid-to-high SNRs, as reported in this study. The theoretical shapes of both the MRT and CAT PI
functions based on the data reported in this study are shown in Fig. 4.
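
To make the effect of the guess ratios concrete, the sketch below computes the chance level for each test and applies the conventional correction for guessing, (score - chance) / (100 - chance). This correction is a standard psychometric adjustment brought in here for illustration only; it is not an analysis performed in this study, the function names are hypothetical, and it assumes listeners guess uniformly among all alternatives.

```python
def chance_level(n_alternatives):
    """Percent correct expected from uniform random guessing."""
    return 100.0 / n_alternatives


def correct_for_guessing(score_pct, n_alternatives):
    """Rescale a percent score so that chance maps to 0 and 100 stays 100."""
    chance = chance_level(n_alternatives)
    return max(0.0, (score_pct - chance) / (100.0 - chance) * 100.0)


MRT_ALTERNATIVES = 6
CAT_ALTERNATIVES = 126

print(round(chance_level(MRT_ALTERNATIVES), 1))  # 16.7
print(round(chance_level(CAT_ALTERNATIVES), 1))  # 0.8

# Mean scores at -18 dB SNR from Table 1
print(round(correct_for_guessing(35.79, MRT_ALTERNATIVES), 2))
print(round(correct_for_guessing(65.57, CAT_ALTERNATIVES), 2))
```

The MRT score drops noticeably after the correction while the CAT score barely changes, which illustrates why the two PI functions must diverge as the SNR decreases toward chance performance.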


Fig.  4.  Theoretical  shapes  of  MRT  and  CAT  PI  functions 
derived  for  the  data  reported  in  the  study. 

The original MRT data and shifted performance-intensity functions for the CAT are shown in
Fig. 5. If the MRT function were shifted to match the CAT data, the shifted MRT function would
reach 100% intelligibility around -7 dB SNR, which corresponds to about 75% intelligibility for
the original MRT function. Thus, the applicability of Eq. (3) under the conditions used in this
study is limited to SNRs between -18 and -6 dB. In addition, the validity of this equation may
be limited to speech levels below 70 dB SPL, since above this level the signal level, in
addition to SNR, affects speech intelligibility (Studebaker et al., 1999).

Fig. 5. Original MRT PI function and shifted CAT PI function matching MRT data.

The CAT scores obtained in this study closely agree with the data reported previously for a
similar range of SNRs by Rao and Letowski (Rao, Letowski, 2003; 2006). Likewise, the reported
MRT scores are similar to those that would be predicted from the normal cumulative fit to the
House et al. (1965) data as well as to those reported by Zera (2004) for a pink noise masker
and Williams and Hecker (1968) for an additive speech-shaped noise. Similar data were also
reported by Nickerson et al. (1960) for the Fairbanks Rhyme Test presented in random noise.
The small differences between the data reported here and in these and other studies possibly
result from differences in the masking noise and SNR measurement methods employed.


5.  Summary  and  conclusions 

The objective of the presented study was to compare the effects of SNR on the SI scores of the
CAT and MRT tests conducted in 65 dB A white background noise. As expected, the results showed
that both the type of speech test and SNR have significant effects on SI scores. Further
analysis showed that the CAT SI scores were significantly higher and relatively less variable
than the MRT SI scores for the same SNRs. In addition, the study revealed a strong positive
relationship between the CAT and MRT scores across the tested range of SNRs.

Because the MRT and CAT scores are highly correlated and both tests result in similar data
variability patterns for the SNRs tested, it can be concluded that the PI functions of both
tests have approximately equivalent shapes in the -18 dB to -6 dB SNR range when both tests are
used in the presence of a 65 dB A white noise. By adding a constant of 26 percentage points to
the MRT score (Eq. (3)), we can predict the CAT score under the test conditions evaluated in the
current study and maintain continuity of the data pattern using the MRT at better SNRs if
needed. The use of the CAT in place of the MRT for adverse military listening conditions below
-6 dB SNR saves time and increases data repeatability. One of the important properties of the
CAT is its simple vocabulary, which may be easily ported to other languages. It is expected that
CAT data may be relatively language independent, but this concept has yet to be tested.